\section{Discussion}\label{sec:dis}

We discuss our results and answer the questions in this section.\\

\textbf{1. Show an example solution sequence for each algorithm for the largest size you tested}

For size $n = 10$ and A* on problem ``7126049853'':

\begin{quote}
\noindent
[7126049853,\_,\_], [126049853,7,\_], [26049853,17,\_], [6049853,217,\_],
[049853,217,6], [49853,217,06], [9853,4217,06], [853,4217,906],
[53,4217,8906], [3,54217,8906], [\_,354217,8906], [\_,8354217,906],
[\_,98354217,06], [0,98354217,6], [0,8354217,96], [0,354217,896],
[0,54217,3896], [0,4217,53896], [0,217,453896], [0,17,2453896],
[10,7,2453896], [210,7,453896], [210,47,53896], [210,547,3896],
[3210,547,896], [3210,47,5896], [43210,7,5896], [543210,7,896],
[543210,87,96], [543210,987,6], [6543210,987,\_], [6543210,87,9],
[6543210,7,89], [76543210,\_,89], [876543210,\_,9], [9876543210,\_,\_]
\end{quote}

For size $n = 10$ and RBFS on problem ``7126049853'':

\begin{quote}
\noindent
[7126049853,\_,\_], [126049853,7,\_], [26049853,17,\_], [6049853,217,\_], 
[049853,6217,\_], [49853,06217,\_], [9853,406217,\_], [853,9406217,\_], 
[53,89406217,\_], [3,589406217,\_], [\_,3589406217,\_], [3,589406217,\_], 
[\_,3589406217,\_], [3,589406217,\_], [\_,3589406217,\_], [3,589406217,\_], 
[\_,3589406217,\_], [3,589406217,\_], [53,89406217,\_], [53,9406217,8], 
[53,406217,98], [3,5406217,98], [3,406217,598], [3,06217,4598], 
[\_,06217,34598], [0,6217,34598], [0,217,634598], [0,17,2634598], 
[10,7,2634598], [210,7,634598], [210,67,34598], [3210,67,4598], 
[43210,67,598], [543210,67,98], [6543210,7,98], [76543210,\_,98], 
[76543210,9,8], [876543210,9,\_], [9876543210,\_,\_] 
\end{quote}

\textbf{2. Is there a clear preference ordering among the heuristics you tested considering the number of nodes searched and the total CPU time taken to solve the problems for the two algorithms?}

The A* algorithm with nonadmissible heuristic performs the best among other heuristic type and algorithm combinations. The admissible heuristics on the two algorithms should yield the same solution lengths and nonadmissible heuristics should yield solution lengths that are not less than the solution lengths from the admissible heuristic.\\

\textbf{3. Can a small sacrifice in optimality give a large reduction in the number of nodes expanded? What about CPU time?}

The non-admissible heuristic sacrifices optimality because it does not necessarily yield an optimal solution. However, the sacrifice in optimality can yield a large reduction in the number of nodes expanded and thereby reducing the overall CPU time. For instance, an admissible heuristic can severely underestimate the true cost and yield a large number of nodes expanded. However, even though a non-admissible heuristic overestimates the true cost, it could be the case that the overestimation is very little to the true cost. Furthermore, one could overestimate to relax the problem. Therefore, there can be a large reduction in the number of nodes expanded and thus a reduction in CPU time, as compared to the admissible heuristic. A concrete example is our 2nd nonadmissible heuristic, where A* performs exceptionally well in performance.\\

\textbf{4. How did you come up with your heuristic evaluation functions?}

We created several heuristic evaluation functions based on some intuitions and for the sake of testing. One of our first heuristic evaluation functions is a simple count of the number of disks in the first peg. Then we gradually took advantage of the goal state's structure and compared it to the current state in our later heuristic designs. Hence, we came up with our heuristic evaluation functions by designing easier ones first, then gradually evolving them into more complex ones. We finally settled on three heuristics, one admissible and two non-admissible for demonstration purposes. These were covered in the previous section. 

We briefly describe some of our other heuristics here, even if some were not that great:

\begin{description}
	\item[Heuristic a] Number of disks in the first peg. RBFS.
	\item[Heuristic b] Total number of disks MINUS the number of matches between goal state and first peg. RBFS.
	\item[Heuristic c] Double the total number of disks MINUS the number of matches between goal state and first peg, AND MINUS the streak length of the first peg to the goal state starting at the bottom. RBFS.
	\item[Heuristic d] Sum the distances of all current disk positions to their goal positions plus the lengths of the second and third pegs. A*.
	\item[Heuristic e] Same as heuristic d for RBFS.
	\item[Heuristic f] Same as heuristic d, but enlarge heuristic value by factor of 2. A*. 
	\item[Heuristic i] Same as heuristic d, but instead of summing the distances, take the max distance of every disk position to their goal position on the first peg. A*.
	\item[Heuristic j] Same as heuristic f for RBFS.
\end{description}

Note: since our two algorithms utilize different data structures, some heuristics are equivalent but use different data structures. Our organization was not the best.\\

\textbf{5. How do the two algorithms compare in the amount of search involved and the cpu-time?}

According to our results, in general, A* does a better job by expanding fewer nodes than RBFS and using less cpu-time. Furthermore, the nonadmissible heuristic does a better job than the admissible heuristic performance-wise. We can see these on our graphs, where RBFS has higher values of number of expanded nodes and cpu-time as the number of disks increases, and similarly for the admissible heuristic.\\

\textbf{6. Do you think that either of these algorithms scale to even larger problems? What is the largest problem you could solve with the best algorithm+heuristic combination? Report the wall-clock time, CPU-time, and the number of nodes searched.}

The algorithms could scale to even larger problems if the heuristic were designed very well. It is amazing to see an algorithm with one heuristic run very slowly with many node expansions while the same algorithm with a better heuristic finishing the job much faster with significantly fewer node expansions. Therefore, a carefully designed heuristic will allow both algorithms to scale to even larger problems by reducing the amount of expanded nodes and thereby reducing CPU time.

A smaller factor in scaling to larger problems is the algorithm design. A* may run out of memory eventually compared to RBFS since A* stores every node it expands, but this would take awhile since our computers contain a lot of memory. RBFS could scale further since it does not store every node in memory, but it will take longer to finish the job due to that trade-off. However, changing the heuristic would have the most impact because the overall number of expanded nodes would be reduced with a better heuristic, regardless of the algorithm chosen.

The largest problem we could solve is of disk size 10, using A* and the non-admissible. It took less than ten minutes. For the initial state of ``'', it was able to solve the problem in 1.41 seconds and 6536 expanded nodes, with a solution length of 34 steps. As a non-admissible heuristic, it may not necessarily be the optimal solution, but the problem was nonetheless solved.

We went further and tested disks sizes above 10. We tested up to size 15. Given an initial state of $[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14]$, the A* algorithm using the non-admissible heuristic was able to solve the problem in 2687 seconds and 987156 expanded nodes, with a solution length of 62 steps. That is about 44 minutes. We believe we can go higher, but it will take hours to complete. This demonstrates that our algorithms and heuristics can scale to even larger problems.\\

\textbf{7. Is there any tradeoff between how good a heuristic is in cutting down the number of nodes and how long it took to compute? Can you quantify it?}

A heuristic function $H1$ can take longer to compute than heuristic function $H2$, but if $H1$ significantly reduces the number of nodes, then the overall time to solve the problem could be less using $H1$ than $H2$. In every experiment, we measured every time it takes to compute a heuristic function and averaged them. To take one concrete example, the heuristic of counting the number of disks in the first peg takes shorter computation time than the heuristic of calculating matches between the first peg and the goal state. As the disk size increases, the computation time noticeably but slightly increases. However, the matching heuristic performs better than the counting heuristic because it reduces the number of expanded nodes significantly. Therefore, while a heuristic evaluation function may take longer to compute, it may reduce the overall number of expanded nodes to make a significant reduction in overall computation time of the problem.\\

